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Abstract 

We  address  the  problem  of  model  checking  stochastic  systems,  i.e.  checking  whether  a  stochas¬ 
tic  system  satisfies  a  certain  temporal  property  with  a  probability  greater  (or  smaller)  than  a  fixed 
threshold.  In  particular,  we  present  a  novel  Statistical  Model  Checking  (SMC)  approach  based  on 
Bayesian  statistics.  We  show  that  our  approach  is  feasible  for  hybrid  systems  with  stochastic  transi¬ 
tions,  a  generalization  of  Simulink/Stateflow  models.  Standard  approaches  to  stochastic  (discrete) 
systems  require  numerical  solutions  for  large  optimization  problems  and  quickly  become  infeasi¬ 
ble  with  larger  state  spaces.  Generalizations  of  these  techniques  to  hybrid  systems  with  stochastic 
effects  are  even  more  challenging.  The  SMC  approach  was  pioneered  by  Younes  and  Simmons 
in  the  discrete  and  non-Bayesian  case.  It  solves  the  verification  problem  by  combining  random¬ 
ized  sampling  of  system  traces  (which  is  very  efficient  for  Simulink/Stateflow)  with  hypothesis 
testing  or  estimation.  We  believe  SMC  is  essential  for  scaling  up  to  large  Stateflow/Simulink  mod¬ 
els.  While  the  answer  to  the  verification  problem  is  not  guaranteed  to  be  correct,  we  prove  that 
Bayesian  SMC  can  make  the  probability  of  giving  a  wrong  answer  arbitrarily  small.  The  advantage 
is  that  answers  can  usually  be  obtained  much  faster  than  with  standard,  exhaustive  model  check¬ 
ing  techniques.  We  apply  our  Bayesian  SMC  approach  to  a  representative  example  of  stochastic 
discrete-time  hybrid  system  models  in  Stateflow/Simulink:  a  fuel  control  system  featuring  hybrid 
behavior  and  fault  tolerance.  We  show  that  our  technique  enables  faster  verification  than  state-of- 
the-art  statistical  techniques,  while  retaining  the  same  error  bounds.  We  emphasize  that  Bayesian 
SMC  is  by  no  means  restricted  to  Stateflow/Simulink  models:  we  have  in  fact  successfully  applied 
it  to  very  large  stochastic  models  from  Systems  Biology. 


1  Introduction 


Stochastic  effects  arise  naturally  in  hybrid  control  systems,  for  example,  because  of  uncertainties 
present  in  the  system  environment  (e.g.,  the  reliability  of  sensor  readings  and  actuator  effects  in 
control  systems,  the  impact  of  timing  inaccuracies,  the  reliability  of  communication  links  in  a 
wireless  sensor  network,  or  the  rate  of  message  arrivals  on  an  aircraft’s  communication  bus).  Un¬ 
certainty  can  be  modeled  via  a  probability  distribution,  thereby  resulting  in  a  stochastic  system, 
i.e.,  a  system  which  exhibits  probabilistic  behavior.  This  raises  the  question  of  how  to  verify  that  a 
stochastic  system  satisfies  a  certain  property.  For  example,  we  want  to  know  whether  the  probabil¬ 
ity  of  an  engine  controller  failing  to  provide  optimal  fuel/air  ratio  is  smaller  than  0.001;  or  whether 
the  ignition  succeeds  within  1ms  with  probability  at  least  0.99.  In  fact,  several  temporal  logics 
have  been  developed  in  order  to  express  these  and  other  types  of  probabilistic  properties  [3,  11,  1]. 
The  Probabilistic  Model  Checking  (PMC)  problem  is  to  decide  whether  a  stochastic  model  satis¬ 
fies  a  temporal  logic  property  with  a  probability  greater  than  or  equal  to  a  certain  threshold.  More 
formally,  suppose  fM  is  a  stochastic  model  over  a  set  of  states  S,  so  is  a  starting  state,  (|>  is  a  formula 
in  temporal  logic,  and  0  G  (0, 1)  is  a  probability  threshold.  The  PMC  problem  is:  to  decide  algo¬ 
rithmically  whether  M,sq  |=  P>q{(. |)),  i.e.,  to  decide  whether  the  model  M  starting  from  .so  satisfies 
the  property  (])  with  probability  at  least  0.  In  this  paper,  property  (])  is  expressed  in  Bounded  Lin¬ 
ear  Temporal  Logic  (BLTL),  a  variant  of  LTL  [21]  in  which  the  temporal  operators  are  equipped 
with  time  bounds.  Alternatively,  BLTL  can  be  viewed  as  a  sublogic  of  Koymans’  Metric  Temporal 
Logic  [16,  20].  As  system  models  fM,  we  use  a  stochastic  version  of  hybrid  systems  modeled  in 
Stateflow/Simulink. 

Existing  algorithms  for  solving  the  PMC  problem  fall  into  one  of  two  categories.  The  first 
category  comprises  numerical  methods  that  can  compute  the  probability  that  the  property  holds 
with  high  precision  (e.g.  [2,  3,  5,  6,  13]).  Numerical  methods  are  generally  only  suitable  for 
finite-state  systems  of  about  107  —  108  states  [17].  In  real  control  systems,  the  number  of  states 
easily  exceeds  this  limit  or  is  infinite,  which  motivates  the  need  for  algorithms  for  solving  the  PMC 
problem  in  a  probabilistic  fashion,  such  as  Statistical  Model  Checking  (SMC).  These  techniques 
heavily  rely  on  simulation  which,  especially  for  large,  complex  systems,  is  generally  easier  and 
faster  than  a  full  symbolic  study  of  the  system.  This  can  be  an  important  factor  for  industrial  sys¬ 
tems  designed  using  efficient  simulation  tools  like  Stateflow/Simulink.  Since  all  we  need  for  SMC 
are  simulations  of  the  system,  we  neither  have  to  translate  system  models  into  separate  verification 
tool  languages,  nor  have  to  build  symbolic  models  of  the  system  (e.g.,  Markov  chains)  appropriate 
for  numerical  methods.  This  simplifies  and  speeds  up  the  overall  verification  process.  The  most 
important  question,  however,  is  what  information  can  be  concluded  from  the  simulations  about  the 
overall  probability  that  (])  holds  for  M.  The  key  for  this  are  statistical  techniques  based  on  fair  (iid 
=  independent  and  identically  distributed)  sampling  of  system  behavior. 

Statistical  Model  Checking  treats  the  PMC  problem  as  a  statistical  inference  problem,  and 
solves  it  by  randomized  sampling  of  the  traces  (or  simulations)  from  the  model.  We  model  check 
each  sample  trace  separately  to  determine  whether  the  BLTL  property  (])  holds,  and  the  number 
of  satisfying  traces  is  used  to  decide  whether  M  |=  P>e((|)).  This  decision  is  made  by  means  of 
either  estimation  or  hypothesis  testing.  In  the  first  case  one  seeks  to  estimate  probabilistically  (i.e., 
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compute  with  high  probability  a  value  close  to)  the  probability  that  the  property  holds  and  then 
compare  that  estimate  to  0  [12,  23]  (in  statistics  such  estimates  are  known  as  confidence  intervals). 
In  the  second  case,  the  PMC  problem  is  directly  treated  as  a  hypothesis  testing  problem  (e.g., 
[27,  28]),  i.e.,  deciding  between  the  hypothesis  Hq\  M  \=  P>e(^)  that  the  property  holds  versus 
the  hypothesis  H\  :  fM  |=  P<e(<|>)  that  it  does  not. 

Hypothesis-testing  based  methods  are  more  efficient  than  those  based  on  estimation  when  0 
(which  is  specified  by  the  user)  is  significantly  different  from  the  true  probability  that  the  property 
holds  (which  is  determined  by  M  and  sq)  [26].  In  this  paper  we  show  that  estimation  can  be  much 
faster  for  probabilities  close  to  1 .  Also  note  that  Statistical  Model  Checking  cannot  guarantee  a 
correct  answer  to  the  PMC  problem.  The  most  crucial  question  needed  to  obtain  meaningful  results 
is  whether  the  probability  that  the  algorithm  gives  a  wrong  answer  can  be  bounded.  We  prove  that 
this  error  probability  can  indeed  be  bounded  arbitrarily  by  the  user. 

Our  SMC  approach  encompasses  both  hypothesis  testing  and  estimation,  and  it  is  based  on 
Bayes’  theorem  and  sequential  sampling.  Bayes’  theorem  enables  us  to  incorporate  prior  infor¬ 
mation  about  the  model  being  verified.  Sequential  sampling  means  that  the  number  of  sampled 
traces  is  not  fixed  a  priori,  but  our  algorithms  instead  determine  the  sample  size  at  “run-time”, 
depending  on  the  evidence  gathered  by  the  samples  seen  so  far.  Because  conclusive  information 
from  the  samples  can  be  used  to  stop  our  SMC  algorithms  as  early  as  possible,  this  often  leads  to 
significantly  smaller  number  of  sampled  traces  (simulations).  While  our  sequential  sampling  has 
many  practical  advantages  compared  to  fixed-size  sampling,  its  theoretical  analysis  is  significantly 
more  challenging. 

We  apply  our  approach  to  a  representative  example  of  discrete-time  stochastic  hybrid  system 
models  in  Stateflow/Simulink:  a  fault-tolerant  fuel  control  (hybrid)  system.  We  show  that  our 
approach  enables  faster  verification  than  state-of-the-art  techniques  based  on  statistical  methods. 

The  contributions  of  this  paper  are  as  follows: 

•  We  show  how  Statistical  Model  Checking  can  be  used  for  Stateflow/Simulink-style  hybrid  sys¬ 
tems  with  probabilistic  transitions. 

•  We  give  the  first  application  of  Bayesian  sequential  interval  estimation  to  Statistical  Model 
Checking. 

•  We  prove  analytic  error  bounds  for  our  Bayesian  sequential  hypothesis  testing  and  estimation 
algorithms. 

•  In  a  series  of  experiments  with  different  parameterizations  of  a  relevant  Simulink/Stateflow 
model,  we  empirically  show  that  our  sequential  estimation  method  performs  better  than  other 
estimation-based  Statistical  Model  Checking  approaches.  In  some  cases  our  algorithm  is  faster 
by  several  orders  of  magnitudes. 

While  the  theoretical  analysis  of  Statistical  Model  Checking  is  very  challenging,  a  beneficial  prop¬ 
erty  of  our  algorithms  is  that  they  are  easy  to  implement. 
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2  Background 


Our  algorithm  can  be  applied  to  any  stochastic  model  Ovt  for  which  it  is  possible  to  define  a  prob¬ 
ability  space  over  its  traces.  Several  stochastic  models  like  discrete/continuous  Markov  chains 
satisfy  this  property  [28].  Here  we  use  discrete-time  hybrid  systems  a  la  Stateflow/Simulink  with 
probabilistic  transitions. 


Discrete  Time  Hybrid  Systems  with  Probabilistic  Transitions  As  a  system  model,  we  con¬ 
sider  discrete  time  hybrid  systems  with  additional  probabilistic  transitions  (our  case  study  uses 
Stateflow/Simulink).  Such  a  model  M  gives  rise  to  a  transition  system  that  allows  for  discrete 
transitions  (e.g.,  from  one  Stateflow  node  to  another),  continuous  transitions  (when  following  dif¬ 
ferential  equations  underlying  Simulink  models),  and  probabilistic  transitions  (following  a  known 
probability  distribution).  For  Stateflow/Simulink,  a  state  assigns  real  values  to  all  the  state  vari¬ 
ables  and  identifies  the  current  discrete  state  (or  location)  for  Stateflow  machines. 

Formally,  we  start  with  a  definition  of  a  deterministic  automaton.  Then  we  augment  it  with 
probabilistic  transitions. 

Definition  1.  A  discrete-time  hybrid  automaton  (DTHA)  consists  of: 

•  a  continuous  state  space  E”; 

•  a  directed  graph  with  vertices  Q  (locations)  and  edges  E  ( control  switches); 

•  one  initial  state  (<?oAo)  £  Q  x  E”; 

•  flows  cp q(t;x)  G  M'1,  representing  the  state  reached  after  staying  in  location  q  for  time  t  >  0, 
starting  from  x  G  EE  ; 

•  jump  functions  jumpe  :  E'7  — >  W1  for  edges  e  G  E.  We  assume  jumpe  to  be  measurable  (preimages 
of  measurable  sets  under  jumpe  are  measurable). 

Definition  2.  The  transition  relation  for  a  deterministic  DTHA  is  defined  over  Q  x  E”  as 

(q}x)  (q,x)  (q,x) 


where 

•  Fort  G  E>o,  we  have  (q,x)  — >t  (q,x)  iffx  —  cp q(t;x); 

•  For  e  EE,  we  have  (q,x)  ~^e  (q,x)  iffx  =  jumpflx)  and  e  is  an  edge  from  q  to  q; 

•  A  :  Q  x  W1  — > ►  M>o  U  E  is  the  simulation  function. 
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The  simulation  function  A  makes  system  runs  deterministic  by  selecting  which  discrete  or 
continuous  transition  to  execute  from  the  respective  state  (q,x).  For  Stateflow/Simulink,  A  satisfies 
several  properties,  including  that  the  first  edge  e  (in  clockwise  orientation  in  the  graphical  notation) 
that  is  enabled  (i.e.,  where  a  jump  is  possible)  will  be  chosen.  Furthermore,  if  an  edge  is  enabled, 
a  discrete  transition  will  be  taken  rather  than  a  continuous  transition. 

Each  execution  of  a  DTHA  is  obtained  by  following  the  transition  relation  repeatedly  from 
state  to  state.  A  sequence  o  =  (so, to),  (s\,  t\), ...  of  Sj  G  Q  x  R'!  and  tt  G  R>o  is  called  trace  iff 
so  =  (qo,xo)  and  for  each  i  G  N,  Si  -^&(Si)  R+ 1  and: 

1.  ti  —  A  (si)  if  A  (si)  G  M>o  (continuous  transition),  or 

2.  ti  =  0  if  A(sj)  G  E  (discrete  transition). 

Thus  the  system  follows  transitions  from  st  to  .v,  f.  i .  If  this  transition  is  a  continuous  transition,  then 
ti  is  its  duration  A  (si),  otherwise  /,•  =  0  for  discrete  transitions.  In  particular,  the  global  time  at  state 
Sj  —  (qt,Xi)  is  Lo <i<ifl-  We  require  that  the  sum  Yq  h  must  diverge,  that  is,  the  system  cannot  make 
infinitely  many  state  switches  in  finite  time  (non-zeno).  We  denote  £o </<;?/  by  x(x,-),  because  we 
can  assume  there  is  one  state  variable  tracking  global  time. 

A  probabilistic  DTHA  is  obtained  from  a  DTHA  by  means  of  a  probabilistic  simulation  func¬ 
tion  instead  of  A.  Unlike  A,  it  selects  discrete  and  continuous  transitions  according  to  a  probability 
density.  The  state  of  a  probabilistic  DTHA  is  a  probability  density  function  on  Q  x  R”.  We  denote 
the  set  of  these  functions  by  D(Q  x  Mw). 

Definition  3.  The  transition  function  for  a  probabilistic  DTHA,  which  we  denote  by  — >,  maps  a 
( probabilistic )  state  p  G  D(Qx  R”)  to  p  G  D(Q  x  R'1)  with  p(q,x)  defined  as: 

f  f  p(q,x)Tl(q,x)(a)I^a^)(q,x)d(q,x)da 

R>0UEQxRn 


where 

•  n  :Q  x  R"  — >  D(R>o  UR)  is  the  (measurable)  probabilistic  simulation  function; 

•  x)  LS  the  indicator  function  of  the  preimage  of  —?u  at  (q,x),  i.e.,  I^a^^(q,x)  =  1  iff 
(q,x)  —*a  (q,x),  and  0  otherwise;  is  as  per  Definition  2. 

Well-definedness  of  the  integral  in  Def.  3  follows  directly  from  measurability  of  n  and  the 
jump  functions,  plus  the  fact  that  integration  over  time  can  be  restricted  to  a  bounded  interval  from 
0  to  the  current  time  x(x).  Note  that  initial  distributions  on  the  initial  state  can  be  obtained  easily 
by  prefixing  the  system  with  a  probabilistic  transition  from  xo-  Sample  traces  of  a  probabilistic 
DTHA  can  be  obtained  by  sampling  from  the  traces  generated  by  n. 
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Specifying  Properties  in  Temporal  Logic  Our  algorithm  verifies  properties  of  M  expressed  as 
formulas  in  Probabilistic  Bounded  Linear  Temporal  Logic  PBLTL).  We  first  define  the  syntax  and 
semantics  of  Bounded  Linear  Temporal  Logic  (BLTL),  which  we  can  check  on  a  single  trace,  and 
then  extend  that  logic  to  PBLTL.  Finkbeiner  and  Sipma  [8]  have  defined  a  variant  of  LTL  on  finite 
traces  of  discrete-event  systems  (where  time  is  thus  not  considered). 

For  a  stochastic  model  M ,  let  the  set  of  state  variables  SV  be  a  finite  set  of  real- valued  variables. 
A  Boolean  predicate  over  SV  is  a  constraint  of  the  form  y~v,  where  y  £  SV,  ~  £  {>,<,=},  and 
vGi  A  BLTL  property  is  built  on  a  finite  set  of  Boolean  predicates  over  SV  using  Boolean 
connectives  and  temporal  operators.  The  syntax  of  the  logic  is  given  by  the  following  grammar: 

<t>  :;=  y~v|  (4>i  v §2)  I  (<h  a 4>2)  |  ~4i  | 

where  rs_/  £  {>,<,=},  y  £  SV,  v  £  Q,  and  t  £  Q>o-  As  usual,  we  can  define  additional  temporal 
operators  such  as  =  TrueU1^)/,  or  Gl\|/  =  — 'F1 — 1X4/  by  bounded  untils  U*. 

We  define  the  semantics  of  BLTL  with  respect  to  executions  of  M .  The  fact  that  an  execution 
o  satisfies  property  (f>  is  denoted  by  o  \=  (]).  We  denote  the  trace  suffix  starting  at  step  i  by  o'  (in 

particular,  o°  denotes  the  original  execution  o).  We  denote  the  value  of  the  state  variable  y  in  o  at 

step  i  by  V  (a,  i,y). 

Definition  4.  The  semantics  of  BLTL  for  a  trace  <Jk  starting  at  the  kth  state  (k£  N)  is  defined  as 
follows: 

•  <Jk  |=  y  ~  v  if  and  only  ifV  ( <J,k,y )  ~  v; 

•  <Jk  |=  <f>i  V  <f>2  if  and  only  ifak  \=  <f>i  or  <5k  |=  ([>2; 

•  <Jk  |=  <f>i  A  <f»2  if  and  only  ifak  \=  4>i  and  <5k  \=  <f»2,' 

•  ok  |=  — i<])i  if  and  only  ifak  |=  (|)  1  does  not  hold  (written  ak  \f  (|)|  J; 

•  ak  (f)  1  Urc|)2  if  and  only  if  there  exists  i  £  N  such  that  (a)  Lo  <l<fk+l  A  t,  (b)  ak+l  |=  <f»2  and  (c) 
for  each  0  <  /  <  i,  ak  l  J  |=  <}>i. 

Statistical  Model  Checking  decides  probabilistic  Model  Checking  by  repeatedly  checking  whether 
<7  |=  <t>  holds  on  sample  simulations  o  of  the  system.  In  practice,  sample  simulations  only  have  a 
finite  duration.  The  question  is  how  long  these  simulations  have  to  be  for  the  formula  4>  to  have 
a  well-defined  semantics  such  that  o  |=  <|)  can  be  checked.  If  o  is  too  short,  say  of  duration  2,  the 
semantics  of  (|)iU5(|)2  may  be  unclear.  But  at  what  duration  of  the  simulation  can  we  stop  because 
we  know  that  the  truth-value  for  o  |=  ()  will  never  change  by  continuing  the  simulation?  Is  the 
number  of  required  simulation  steps  expected  to  be  finite  at  all? 

For  a  class  of  finite  length  continuous-time  boolean  signals,  well-definedness  of  checking 
bounded  MITL  properties  has  been  conjectured  in  [19].  Here  we  generalize  to  infinite,  hybrid 
traces  with  real- valued  signals.  We  prove  well-definedness  and  the  fact  that  a  finite  prefix  of  the 
discrete  time  hybrid  signal  is  sufficient  for  BLTL  model  checking,  which  is  crucial  for  termination. 
It  especially  turns  out  that  divergence  of  time  ensures  termination  of  SMC. 

Lemma  1  (Bounded  sampling).  The  problem  “o  |=  <f) ”  is  well-defined  and  can  be  checked  for 
BLTL  formulas  (|>  and  traces  o  based  on  only  a  finite  prefix  0/0  of  bounded  duration. 
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For  proving  Lemma  1  we  need  to  derive  bounds  on  when  to  stop  simulation.  Those  bounds  can 
be  read  off  easily  from  the  BLTL  formula: 


Definition  5.  We  define  the  sampling  bound  #((|))  e  Q>o  of  a  BLTL  formula  (])  inductively  as  the 
maximum  nested  sum  of  time  bounds: 

#(; y  ~  v)  :=  0 

#(H>0  :=  #(4>t) 

#(<(> l  V  <J>2 )  :=  max(#(<|)i) , #(<t)2)) 

#C<t> l  A <t»2)  :=  max(#(<|>i),#(<|>2)) 

#(<f)iUr4>2)  :=  r  +  max(#(^i),#(^2)) 


Unlike  infinite  traces,  actual  system  simulations  need  to  be  finite  in  length.  We  prove  that  the 
semantics  of  BLTL  formulas  (f>  is  well-defined  on  finite  prefixes  of  traces  with  a  duration  that  is 
bounded  by  #(<|)) . 

Lemma  2  (BLTL  on  bounded  simulation  traces).  Let  (])  he  a  BLTL  formula,  UN.  Then  for  any 
two  infinite  traces  o  =  (voTo),  (nTi),  •  •  •  and  6  =  (so,  to),  (,f| .  t\  ), . . .  with 

sk+i  =  h+i  an d  tk+i  =  tk+i  V/  e  N  with  £  tk+i  <  #(<|>)  (1) 

0</<7 


we  have  that 

ak  iff  ak  \=  4>  . 

Proof  The  proof  is  by  induction  on  the  structure  of  the  BLTL  formula  4).  IH  is  short  for  induction 
hypothesis. 

1.  If  4>  is  of  the  form  y  ~  v,  then  ck  \=  y  ~  v  iff  dk  |—  y  ~  v,  because  s^  —  by  using  (1)  for  i  —  0. 

2.  If  4>  is  of  the  form  4>i  V  4>2,  then 

vk  1=  4>t  v  4>2 

iff  ak  \=  4>i  or  ck  \=  4>2 

iff  ak  \=  4>i  or  6k  \=  4>2  by  IH  as  #(4>i  V  <|)2)  >  #(4>i) 

and  #((|)i  V  4)2)  >  #(4>2) 

iff  ok  \=  4>t  v  4>2 

The  proof  is  similar  for  -41  and  4>i  A  4>2- 


3.  If  4>  is  of  the  form  4>iUr(f>2,  then  ak  \=  4>iUr<f>2  iff  conditions  ( a),(b),(c )  of  Definition  4  hold. 
Those  conditions  are  equivalent,  respectively,  to  the  following  conditions  (a'),(b'),(c')\ 

(a')  Lo <l<ih+l  <  t,  because  #(4>iUr(|)2)  >  t  such  that  the  durations  of  trace  o  and  0  are  tk+i  —  h+l 
for  each  index  /  with  0  <  /  <  i  by  assumption  (1). 
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(i b ')  ok+l  \=  (|)2  by  induction  hypothesis  as  follows:  We  know  that  the  traces  o  and  o  match  at  k  for 
duration  #((|)iU'(|)2)  and  need  to  show  that  the  semantics  of  cf) i U^(|)2  matches  at  k.  By  IH  we 
know  that  <f>2  has  the  same  semantics  at  k  +  i  (that  is  dk+l  |=  <f>2  iff  ok+l  |=  4>2)  provided  that 
we  can  show  that  the  traces  o  and  o  match  at  k  +  i  for  duration  #{§2)-  F°r  this,  consider  any 
I  e  N  with  Lo <i<jtk+i+i  <  #(<h)-  Then 

#(^>2)  >  52  tk+i+i  =  H  tk+ 1  -  H  tk+ 1 

0  </</  0<l<i+I  0  <l<i 

(a) 

>  52  tk+l  ~ f 
0  </<('+/ 

Thus 

52  *k+i  —  * +^(^2) 

0  <l<i+I 

<  t  +  max(#(<|>1),#((|>2))  =#(([)iU>2) 

As  /  e  N  was  arbitrary,  we  conclude  from  this  with  assumption  (1)  that,  indeed  sj  —  sj  and 
tj  —  tj  for  all  I  E  N  with 

52  tk+i+l  -  #(^2) 

0  <1<I 

Thus  the  IH  for  <|)2  yields  the  equivalence  of  ok+l  \=  <f>2  and  ok  1 '  |=  <f»2  when  using  the  equiv¬ 
alence  of  ( a )  and  {a'). 

(c')  for  each  0  <j<i,  ck+j  [=  4>  1 .  The  proof  of  equivalence  to  (c)  is  similar  to  that  for  (br)  using 

j  <  i- 

The  existence  of  an  i  e  N  for  which  these  conditions  hold  is  equivalent  to  ak  \=  (|)  [  U'  cf>2 -  □ 

Now  we  prove  that  Lemma  1  holds  using  prefixes  of  traces  according  to  the  sampling  bound  #(<|)), 
which  guarantees  that  finite  simulations  are  sufficient  for  deciding  (]). 

Proof  of  Lemma  1.  According  to  Lemma  2,  the  decision  “o  \=  <f>'’  is  uniquely  determined  (and 
well-defined)  by  considering  only  a  prefix  of  o  of  duration  #((|))  e  Q>o-  By  divergence  of  time, 
o  reaches  or  exceeds  this  duration  #((f>)  in  some  finite  number  of  steps  n.  Let  o'  denote  a  finite 
prefix  of  o  of  length  n  such  that  £0 <i<nL  >  #((l))-  Again  by  Lemma  2,  the  semantics  of  o'  \=  <()  is 
well-defined  because  any  extension  a"  of  o'  satisfies  a"  \=  (|)  if  and  only  if  o'  \=  (|).  Consequently 
the  semantics  of  o'  |=  <]>  coincides  with  the  semantics  of  o  |=  (f>.  On  the  finite  trace  o',  it  is  easy  to 
see  that  BLTL  is  decidable  by  evaluating  the  atomic  formulas  x  ~  v  at  each  state  .y,  of  the  system 
simulation.  □ 

We  now  define  Probabilistic  Bounded  Linear  Temporal  Logic. 

Definition  6.  A  Probabilistic  Bounded  LTL  (PBLTL)  formula  is  a  formula  of  the  form  P>e (([>), 
where  (|)  is  a  BLTL  formula  and  0  G  (0, 1)  is  a  probability. 
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We  say  that  M  satisfies  PBLTL  property  P>q{( f>) ,  denoted  by  \=  P>q{( |)),  if  and  only  if  the  prob¬ 
ability  that  an  execution  trace  of  M  satisfies  BLTL  property  <f>  is  greater  than  or  equal  to  0.  This 
problem  is  well-defined,  because,  by  Lemma  1,  each  o  |=  (])  is  decidable  on  a  finite  prefix  of  o,  finite 
iterations  of  the  probabilistic  transition  function  (Def.  3)  gives  a  well-defined  probability  measure, 
and,  thus,  a  corresponding  probability  measure  can  be  associated  to  the  set  of  all  (non-zeno)  exe¬ 
cutions  of  M  that  satisfy  a  BLTL  formula  [28].  Note  that  counterexamples  to  the  BLTL  property 
4>  are  not  counterexamples  to  the  PBLTL  property  P>e (<)>),  because  the  truth  of  P>q(§)  depends  on 
the  likelihood  of  all  counterexamples  to  (f>.  This  makes  PMC  more  difficult  than  standard  Model 
Checking,  because  one  counterexample  to  (f>  is  not  enough  to  decide  P>e  (<|>). 


3  Bayesian  Interval  Estimation 

We  present  our  new  Bayesian  statistical  estimation  algorithm.  In  this  approach  we  are  interested 
in  estimating  p,  the  (unknown)  probability  that  an  execution  trace  of  M  satisfies  a  given  BLTL 
property.  The  estimate  will  be  in  the  form  of  a  confidence  interval,  i.e.,  an  interval  which  will 
contain  p  with  arbitrarily  high  probability. 

Recall  that  the  PMC  problem  is  to  decide  whether  M  \=  P>q{(. |)),  where  0  G  (0, 1)  and  (])  is  a 
BLTL  formula.  Let  p  be  the  (unknown  but  fixed)  probability  of  the  model  satisfying  (f>:  thus,  the 
PMC  problem  can  now  be  stated  as  deciding  between  two  hypotheses: 

H0:p^Q  H\:p<Q.  (2) 

For  any  trace  o,  of  the  system  M.  we  can  deterministically  decide  whether  o,  satisfies  BLTL 
formula  (f>.  Therefore,  we  can  define  a  Bernoulli  random  variable  X,  denoting  the  outcome  of 
o,  <f).  The  conditional  probability  mass  function  associated  with  Xj  is  thus: 

Vw  G  [0, 1]  f{xi\u)  =  uXi(  1  -  u)'~Xi  (3) 

where  Xi  —  1  iff  o,  |=  4>,  otherwise  x-,  =  0.  Note  that  the  A,  are  (conditionally)  independent  and 
identically  distributed  (iid),  as  each  trace  is  given  by  an  independent  execution  of  the  model.  Since 
p  is  unknown,  we  may  assume  that  it  is  given  by  a  random  variable,  whose  density  g(-)  is  called 
the  prior  density.  The  prior  is  usually  based  on  our  previous  experiences  and  beliefs  about  the 
system.  A  lack  of  information  about  the  probability  of  the  system  satisfying  the  formula  is  usu¬ 
ally  summarized  by  a  non-inf ormative  or  objective  prior  (see  [22,  Section  3.5]  for  an  in-depth 
treatment). 

Since  p  lies  in  [0, 1],  we  need  prior  densities  defined  over  this  interval.  In  this  paper  we  focus 
on  Beta  priors  which  are  defined  by  the  following  probability  density  (for  real  parameters  a,  [3  >  0 
that  give  various  shapes): 

Vw  G  [0, 1]  gO,a,[3)=  1  (4) 

B(a,  p) 
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(5) 


where  the  Beta  function  B( a,  (3)  is  defined  as: 

B(a,p)  =  C . 

Jo 

By  varying  the  parameters  a  and  [3.  one  can  approximate  other  smooth  unimodal  densities  on  (0, 1) 
by  a  Beta  density  (e.g.,  the  uniform  density  over  (0, 1)  is  a  Beta  with  a  =  p  =  1).  For  all  u  €  [0, 1] 
the  Beta  distribution  function  F(a,p)(M)  is  defined: 

F(afi)(u)  =  fQ  g(t,a,p)  <*t=  ^  ^  ^  ta_1(l-f)p_1  dt  (6) 

which  is  the  usual  distribution  function  for  a  Beta  random  variable  of  parameters  a,  P  (/.<?.,  the 
probability  that  it  takes  values  less  than  or  equal  to  u). 

In  addition  to  their  flexible  shapes  for  various  choices  of  a,  p,  the  advantage  of  using  Beta 
densities  is  that  the  Beta  distribution  is  the  conjugate  prior  to  the  Bernoulli  distribution1.  This 
relationship  enables  us  to  avoid  numerical  integration  in  the  implementation  of  both  the  Bayesian 
estimation  and  hypothesis  testing  algorithms,  as  we  next  explain. 


3.1  Bayesian  Intervals 


Bayes’  theorem  states  that  if  we  sample  from  a  density  f(-\u),  where  u  (the  unknown  probability) 
is  given  by  a  random  variable  U  over  (0, 1)  whose  density  is  g(-),  then  the  posterior  density  of  U 
given  the  data  x\, . . .  ,xn  is: 


f(u \x\,...,x„) 


f(xi,...,xn\u)g(u) 

Jo /(*t, •••,*« |v)g(v)  dv 


(7) 


and  in  our  case  f(x  i  ,...,x„\  u)  factorizes  as  n”=t  f{xi\u),  where  fixjii)  is  the  Bernoulli  mass  func¬ 
tion  (3)  associated  with  the  i-th  sample  (remember  that  we  assume  conditionally  independent, 
identically  distributed  -  iid  -  samples).  Since  the  posterior  is  an  actual  distribution  (note  the  nor¬ 
malization  constant),  we  can  estimate  p  by  the  mean  of  the  posterior.  In  fact,  the  posterior  mean 
is  a  posterior  Bayes  estimator  of  p,  i.e.,  it  minimizes  the  risk  over  the  whole  parameter  space  of  p 
(under  a  quadratic  loss  function,  see  [7,  Chapter  8]). 

For  a  coverage  goal  c  G  (5, 1),  any  interval  (to,ti)  such  that 


(8) 


is  called  a  100c  percent  Bayesian  intenml  estimate  of  p.  Naturally,  one  would  choose  to  and  t\  that 
minimize  t\  —  to  and  satisfy  (8),  thus  determining  an  optimal  interval.  Note  that  to  and  t\  are  in  fact 
functions  of  the  sample  x\ , . . .  ,xn. 

’A  distribution  P(0)  is  said  to  be  a  conjugate  prior  for  a  likelihood  function,  P(c/|0),  if  the  posterior,  P(Q\d)  is  in 
the  same  family  of  distributions. 
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Optimal  interval  estimates  can  be  found,  for  example,  for  the  mean  of  a  normal  distribution 
with  normal  prior,  where  the  resulting  posterior  is  normal.  In  general,  however,  it  is  difficult  to 
find  optimal  interval  estimates.  For  unimodal  posterior  densities  like,  we  can  use  the  posterior’s 
mean  as  the  “center”  of  an  interval  estimate. 


Here,  we  do  not  pursue  the  computation  of  an  optimal  interval,  which  may  be  numerically 
infeasible.  Instead,  we  fix  a  desired  half-interval  width  8  and  then  sample  until  the  probability 
mass  of  an  interval  estimate  of  width  28  containing  the  posterior  mean  exceeds  c.  When  sampling 
from  a  Bernoulli  distribution  and  with  a  Beta  prior  of  parameters  a,  (1.  it  is  known  that  the  mean  p 
of  the  posterior  is: 


P  = 


X~\-  CL 


(9) 


n  -f-  CL  |3 

where  x  =  £”=1  Xj  is  the  number  of  successes  in  the  sampled  data  v  i . . . . ,  xn.  The  integral  in  (8)  can 
further  be  computed  easily  in  terms  of  the  Beta  distribution  function. 


Proposition  1.  Let  (to,  ft)  be  an  interval  in  [0, 1].  The  posterior  probability  of  Bernoulli  iid  samples 
(x  i , . . . ,  xn )  and  Beta  prior  of  parameters  CL,  (1  can  be  calculated  as: 


i  tQ 


f(u\xi,...,Xn)du  F(x-\-a,,n—x-V$)  (/l )  ^(;«-T(x,w— x+|3)  (O) 


(10) 


where  x  =  YJi=\xi LS  the  number  of  successes  in  (x  i , . . . ,  xn )  and  F(f)  is  the  Beta  distribution  func¬ 
tion. 


Proof  Direct  from  definition  of  Beta  distribution  function  (6)  and  the  fact  that  the  posterior  density 
is  a  Beta  of  parameters  x+CL  and  n  —  x  +  p.  □ 

The  Beta  distribution  function  can  be  computed  with  high  accuracy  by  standard  mathematical 
libraries  {e.g.  the  GNU  Scientific  Library)  or  software  (e.g.  Matlab).  Hence,  the  Beta  distribution 
is  the  appropriate  choice  for  summarizing  the  prior  distribution  in  Statistical  Model  Checking. 


3.2  Bayesian  Estimation  Algorithm 


We  want  to  compute  an  interval  estimate  of  p  —  Prob(  fVf  |=  (|)),  where  cf)  is  a  BLTL  formula  and 
M  a  stochastic  hybrid  system  model  -  remember  from  our  discussion  in  Section  2  that  p  is  well- 
defined.  Fix  the  half-size  8  e  (0,  j)  of  the  desired  interval  estimate  for  p,  the  coefficient  c  G  (^.  1 ) 
to  be  used  in  (8),  and  the  coefficients  a,  p  of  the  Beta  prior. 


Our  algorithm  iteratively  draws  iid  sample  traces  O] ,  o 2, . . .,  and  checks  whether  they  satisfy  <f). 
At  stage  n,  the  algorithm  computes  p.  the  Bayes  estimator  for  p  (i.e.,  the  posterior  mean)  according 
to  (9).  Next,  using  to  =  P  —  8,  t\  —  p  +  8  it  computes 


Y  = 


. . .  .xn)  du  . 


If  y  ^  c  it  stops  and  returns  to,t\  and  p\  otherwise  it  samples  another  trace  and  repeats.  One  should 
pay  attention  at  the  extreme  points  of  the  (0, 1)  interval,  but  those  are  easily  taken  care  of,  as  shown 
in  Algorithm  1. 


10 


Algorithm  1  Statistical  Model  Checking  by  Bayesian  Interval  Estimates 

Require:  BLTL  Property  <f>,  half-interval  size  8  e  (0,  j),  interval  coefficient  c  G  (|.  1),  Prior  Beta 
distribution  with  parameters  a,  (1 


n  :=  0  { number  of  traces  drawn  so  far] 

x  :=  0  { number  of  traces  satisfying  <f>  so  far] 

repeat 

o  :=  draw  a  sample  trace  of  the  system  (iid) 

77  :=  n  +  1 


if  <7  |=  <t>  then 
x  :=  x+  1 

end  if 

p  :=  (x  +  a)/(/7  +  a  +  |3) 
(t0,C)  :=  (p-S,p  +  8) 

if  t\  >  1  then 

(fo,fi)  :=  (1-2-6, 1) 

else  if  to  <  0  then 

(fo,ft):=  (0,2-5) 

end  if 

y  :=  PostcriorProb(/()./i  j 

until  (y  ^  c) 
return  (tofi),p 


{compute posterior  mean] 
{compute  interx’al  estimate] 


{compute  posterior  probability  of  p^(t o,?i),  by  (10)] 


4  Bayesian  Hypothesis  Testing 


In  this  section  we  briefly  present  our  sequential  Bayesian  hypothesis  test,  which  was  introduced  in 
[15].  Let  Xi, . . .  ,Xn  be  a  sequence  of  Bernoulli  random  variables  defined  as  for  the  PMC  problem 
in  Sect.  3,  and  let  d  =  (x i, . . .  ,xn)  denote  a  sample  of  those  variables.  Let  Hq  and  H\  be  mutually 
exclusive  hypotheses  over  the  random  variable’s  parameter  space  according  to  (2).  Suppose  the 
prior  probabilities  P(Hq)  and  P(H\ )  are  strictly  positive  and  satisfy  P(Hq)  +P(H i)  =  1.  Bayes’ 
theorem  states  that  the  posterior  probabilities  are 


P(H0\d) 


P(d\H0)P(H0) 

P(d) 


P(Hi\d) 


PidffjPiHQ 

P(d) 


(ID 


for  every  d  with  P(d)  =  P(d\Ho)P(Ho)  +  P(d\Hi)P(Hi)  >  0.  In  our  case  P(d)  is  always  non-zero 
(there  are  no  impossible  finite  sequences  of  outcomes). 
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4.1  Bayes  Factor 

By  Bayes’  theorem,  the  posterior  odds  for  hypothesis  Hq  is 

P(H0\d)  _  P(d\H0 )  P(H0) 

P(Hi\d)  P(d\Hx)  '  P{HX)  ' 

Definition  7.  The  Bayes  factor  T>  of  sample  d  and  hypotheses  Hq  and  H\  is 

p(4Ho) 

P(d\HQ  ' 


For  fixed  priors  in  a  given  example,  the  Bayes  factor  is  directly  proportional  to  the  posterior 
odds  by  (12).  Thus,  it  may  be  used  as  a  measure  of  relative  confidence  in  Hq  vs.  H\,  as  proposed 
by  Jeffreys  [14].  To  test  Hq  vs.  H\,  we  compute  the  Bayes  factor  B  of  the  available  data  d  and 
then  compare  it  against  a  fixed  threshold  T  ^  1:  we  shall  accept  Hq  iff  B  >  T .  Jeffreys  interprets 
the  value  of  the  Bayes  factor  as  a  measure  of  the  evidence  in  favor  of  Hq  (dually,  ^  is  the  evidence 
in  favor  of  H i).  Classically,  a  fixed  number  of  samples  was  suggested  for  deciding  Hq  vs.  H\.  We 
develop  an  algorithm  that  chooses  the  number  of  samples  adaptively. 

We  now  show  how  to  compute  the  Bayes  factor.  According  to  Definition  7,  we  have  to  calculate 
the  ratio  of  the  probabilities  of  the  observed  sample  d={x  i, . . . ,  x„ )  given  Hq  and  H\ .  By  (12),  this 
ratio  is  proportional  to  the  ratio  of  the  posterior  probabilities,  which  can  be  computed  from  Bayes’ 
theorem  (7)  by  integrating  the  joint  density  f(x 1 1  • )  •  •  •  f(xn  \  ■ )  with  respect  to  the  prior  g( ■ ) : 

P{Ho\xh...,xn)  _  Jq  f{u\xi,...,xn)  du  _  Jq  f(xi\u)  ■  ■  ■  f(xn\u)  ■  g{u)  du 

P(Hi\x\,. .  .,xn)  f^  f(u\x i , . . . , xn)  du  Jq  f(x\ \u)  ■  ■  ■  f{xn\u)  ■  g(u)  du 

Thus,  the  Bayes  factor  is: 

P{H0\xh...,xn)  Til  fQf(xi\u)---f(x„\u)-g(u)du  ,13) 

fto  P(Hi\xi,..  ■  ,xn)  7to  Jq  f(x\ \u)  ■  ■  ■f(xn\u)  ■  g(u)  du 

where  7to  =  P{Hq)  =  /e! g(u)  du,  and  Hi  =  P(H\)  =  1  —  7Co-  We  observe  that  the  Bayes  factor 

depends  on  the  data  d  and  on  the  prior  g,  so  it  may  be  considered  a  measure  of  confidence  in  Hq 

vs.  H\  provided  by  the  data  x\,.. .  ,xn,  and  “weighted”  by  the  prior  g.  When  using  Beta  priors,  the 
calculation  of  the  Bayes  factor  can  be  much  simplified. 

Proposition  2.  The  Bayes  factor  of  Hq  :  p  ^  0  vs.  H\  :  p  <  9  with  Bernoulli  samples  (x  i , . . . ,  xn ) 
and  Beta  prior  of  parameters  a,  [1  is: 


ttl 

Tto 


where  x  —  Y!i=\xi  is  the  number  of  successes  in  (x i , . . . , x„ )  and  /7(  s  ,j(-)  is  the  Beta  distribution 
function  of  parameters  s,t. 
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4.2  Bayesian  Hypothesis  Testing  Algorithm 


Our  algorithm  generalizes  Jeffreys’  test  to  a  sequential  version.  Remember  we  want  to  establish 
whether  M  |=  q((|)),  where  0  e  (0, 1)  and  (|>  is  a  BLTL  formula.  The  algorithm  iteratively  draws 

independent  and  identically  distributed  sample  traces  G| ,  02, and  checks  whether  they  satisfy  (f>. 
We  can  model  this  procedure  as  independent  sampling  from  a  Bernoulli  distribution  X  of  unknown 
parameter  p  -  the  actual  probability  of  the  model  satisfying  (f).  At  stage  n  the  algorithm  has  drawn 
samples  xi, . . . . xn  iid  like  X.  It  then  computes  the  Bayes  factor  B  according  to  Proposition  2,  to 
check  if  it  has  obtained  conclusive  evidence.  The  algorithm  accepts  Hq  iff  ‘By  T,  and  accepts  H  \ 
iff  B  <  Otherwise  ^  B  ^  T)  it  continues  drawing  iid  samples.  This  algorithm  is  shown  in 
Algorithm  2. 


Algorithm  2  Statistical  Model  Checking  by  Bayesian  Hypothesis  Testing 

Require:  PBLTL  Property  P^e((|)),  Threshold  T  ^  1,  Prior  density  g  for  unknown  parameter  p 

n  :=  0  { number  of  traces  drawn  so  far} 

x  :=  0  { number  of  traces  satisfying  (])  so  far} 

loop 

o  :=  draw  a  sample  trace  of  the  system  (iid) 
n  :=  n  +  1 

if  o  |=  (f)  then 
x  :-x  + l 

end  if 

B  :=  B  ay  e  sFac  tor  (« ,  x )  {compute  as  in  Proposition  2} 

if  ( By  T )  then 
return  Ho  accepted 
else  if  (B  <  y)  then 
return  H\  accepted 

end  if 
end  loop 


5  Analysis 

Statistical  Model  Checking  algorithms  are  easy  to  implement  and — because  they  are  based  on 
selective  system  simulation — enjoy  promising  scalability  properties.  Yet,  for  the  same  reason, 
their  output  would  be  useless  outside  the  sampled  traces,  unless  the  probability  of  making  an  error 
during  the  PMC  decision  can  be  bounded. 

As  our  main  contribution,  we  prove  error  bounds  for  Statistical  Model  Checking  by  Bayesian 
sequential  hypothesis  testing  and  by  Bayesian  interval  estimation.  In  particular,  we  show  that  the 
(Bayesian)  Type  I-II  error  probabilities  for  the  algorithms  in  Sect.  3-4  can  be  bounded  arbitrarily. 
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We  recall  that  a  Type  I  (II)  error  occurs  when  we  reject  (accept)  the  null  hypothesis  although  it  is 
true  (false). 

Theorem  1  (Error  bound  for  hypothesis  testing).  For  any  discrete  random  variable  and  prior,  the 
probability  of  a  Type  I-II  error  for  the  Bayesian  hypothesis  testing  algorithm  2  is  bounded  above 
by  j,  where  T  is  the  Bayes  Factor  threshold  given  as  input. 


Proof  We  present  the  proof  for  Type  I  error  only  -  for  Type  II  it  is  very  similar.  A  Type  I  error 
occurs  when  the  null  hypothesis  Hq  is  true,  but  we  reject  it.  We  then  want  to  bound  P(reject  Ho  \ 
Hq).  If  the  Bayesian  algorithm  2  stops  at  step  n,  then  it  will  accept  Hq  if  B(d)  >  T,  and  reject  Hq 
if  B(d)  <  j,  where  d  —  (x i, . . .  ,xn)  is  the  data  sample,  and  the  Bayes  Factor  is 


Bid) 


P(d\H0) 
P{d\Hi)  ' 


The  event  {reject  //q}  is  formally  defined  as 


{reject  H0}  =  [J  {B{d)  <  ^  A  D  =  d} 
deO.  1 


(14) 


where  D  is  the  random  variable  denoting  a  sequence  of  n  discrete  random  variables,  and  Q  is  the 
sample  space  of  D  -  i.e.,  the  (countable)  set  of  all  the  possible  realizations  of  D  (in  our  case  D  is 
clearly  finite).  We  now  reason: 


P(reject//o  Hq) 

= 

(14) 

P(Udz£lW)<7  ^D  =  d} 

\  Ho) 

= 

additivity 

LdeaPd^id)  <T  A  D  =  d) 

I  Ho) 

= 

independent  events 

!UznPW)<i).P(p  =  d  | 

i 

Ho) 

B(d)  <±ifiP(D  =  d\H0)  <  ±P(D  =  d\Hl) 

I d&T-P(P  =  d  |  HX) 

= 

additivity  and  independence 

7-p(Vd<=aD  =  d  |  H,) 

= 

universal  event 

jr-P(n  1  HX)  =  d, 

□ 


14 


Note  that  the  bound  j  is  independent  from  the  prior  used. 

Next,  we  lift  the  error  bounds  found  in  Theorem  1  for  Algorithm  2  to  Algorithm  1  by  repre¬ 
senting  the  output  of  the  Bayesian  interval  estimation  algorithm  1  as  a  hypothesis  testing  problem. 
We  use  the  output  interval  (toTi)  °f  the  estimation  algorithm  1  to  define  the  (null)  hypothesis 
Hq  :  p  E  (/()•/])•  Now  Ho  represents  the  hypothesis  that  the  output  of  algorithm  1  is  correct.  Then, 
we  can  test  Ho  and  determine  bounds  on  Type  I  and  II  errors  by  Theorem  1.  We  prove  that  these 
errors  can  be  bounded  by  the  user. 

Theorem  2  (Error  bound  for  estimation).  For  any  discrete  random  variable  and  prior,  the  Type 
I  and  II  errors  for  the  output  interval  {to- 1  \ )  of  the  Bayesian  estimation  algorithm  1  are  bounded 
above  by  j  ,  where  c  is  the  coverage  coefficient  given  as  input  and  7To  is  the  prior  probability 
of  the  hypothesis  Hq  :  p  G  {to,t\). 


Proof  Let  (?o,?i)  be  the  interval  estimate  when  the  estimation  algorithm  1  terminates  (with  cover¬ 
age  c).  From  the  hypothesis 

H0  :  p  G  OoTi)  (15) 

we  compute  the  Bayes  factor  for  Hq  vs.  the  alternate  hypothesis  H\  :  p  f  (jo-ti).  Then  we  use 
Theorem  1  to  derive  the  bounds  on  the  Type  I  and  II  error.  If  the  estimation  algorithm  1  terminates 
at  step  n  with  output  to,  h,  we  have  that: 


(16) 


and  therefore  (since  the  posterior  is  a  distribution): 


. .  .,xn)  du  ^  1 


—  c. 


The  Bayes  factor  of  Hq  vs  H j  is,  by  (13): 
(1-Jtp)  fHof(u 1*1, •••,*«)  du 

no  fHlf(u\xh---,xn)du 

( 1  —  7Cp)  C 
TUo  1  ~C 


(17) 


by  (16)  and  (17) 


Therefore,  by  Theorem  1  the  error  is  bounded  above  by 


(  c(i— 7to)\  1 
\  (1— c)Jto  / 


(l-c)jtQ 

c(l-Jto)' 


□ 
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6  Application 


We  study  an  example  that  is  part  of  the  Stateflow/Simulink  package.  The  model2  describes  a 
fuel  controller  system  for  a  gasoline  engine.  It  detects  sensor  failures,  and  dynamically  changes 
the  control  law  to  provide  seamless  operation.  A  key  quantity  in  the  model  is  the  ratio  between 
the  air  mass  flow  rate  (from  the  intake  manifold)  and  the  fuel  mass  flow  rate  (as  pumped  by  the 
injectors).  The  system  aims  at  keeping  the  air-fuel  ratio  close  to  the  stoichiometric  ratio  of  14.6, 
which  represents  an  acceptable  compromise  between  performance  and  fuel  consumption.  The 
system  estimates  the  “correct”  fuel  rate  giving  the  target  stoichiometric  ratio  by  taking  into  account 
sensor  readings  for  the  amount  of  oxygen  present  in  the  exhaust  gas  -  Exahust  Gas  Oxygen  (EGO) 
-  for  the  engine  speed,  throttle  command  and  manifold  absolute  pressure.  In  the  event  of  a  single 
sensor  fault,  the  system  detects  the  situation  and  operates  the  engine  with  a  higher  fuel  rate  to 
compensate.  If  two  or  more  sensors  fail,  the  engine  is  shut  down,  since  the  system  cannot  reliably 
control  the  air-fuel  ratio. 

The  Stateflow  control  logic  of  the  system  has  a  total  of  24  locations,  grouped  in  6  parallel  (i.e., 
simultaneously  active)  states.  The  Simulink  part  of  the  system  is  described  by  several  nonlinear 
equations  and  a  linear  differential  equation  with  a  switching  condition.  Overall,  this  model  pro¬ 
vides  a  representative  summary  of  the  important  features  of  hybrid  systems.  Our  stochastic  system 
is  obtained  by  introducing  random  faults  in  the  EGO,  speed  and  manifold  pressure  sensors.  We 
model  the  faults  by  three  independent  Poisson  processes  with  different  arrival  rates.  When  a  fault 
happens,  it  is  “repaired”  with  a  fixed  service  time  of  one  second  (i.e.  the  sensor  remains  in  fault 
condition  for  one  second,  then  it  resumes  normal  operation).  Note  that  the  system  has  no  free 
inputs,  since  the  throttle  command  provides  a  periodic  triangular  input,  and  the  nominal  speed  is 
never  changed.  This  ensures  that,  once  we  set  the  three  fault  rates,  for  any  given  temporal  logic 
property  (|>  the  probability  that  the  model  satisfies  (f>  is  well-defined.  All  our  experiments  have 
been  performed  on  a  2.4GHz  Pentium  4,  1GB  RAM  desktop  computer  running  Matlab  R2008b  on 
Windows  XP. 

6.1  Experimental  Results  in  Application 

For  our  experiments  we  model  check  the  following  formula  (null  hypothesis) 

H0  :  M  |=  P>o{^¥mGl{FuelFlowRate  =  0))  (18) 

for  different  values  of  threshold  0  and  sensors  fault  rates.  We  test  whether  with  probability  greater 
than  0  it  is  not  the  case  that  within  100  seconds  the  fuel  flow  rate  stays  zero  for  one  second.  The 
fault  rates  are  expressed  in  seconds  and  represent  the  mean  interarrival  time  between  two  faults 
(in  a  given  sensor).  In  experiment  1,  we  use  uniform  priors  over  (0, 1),  with  null  and  alternate 
hypotheses  equally  likely  a  priori.  In  experiment  2,  we  use  informative  priors  highly  concentrated 
around  the  true  probability  of  the  model  satisfying  the  BLTL  formula.  The  Bayes  Factor  threshold 
is  T  =  1000,  so  by  Theorem  1  both  Type  I  and  II  errors  are  bounded  by  .001. 

2More  information  on  the  model  is  available  at  http :  //mathworks  .  com/products/simulink/ 
demos  .  html?f  ile=/ products/  demos /shipping/  simulink/  sldemo_fuelsys  .  html  . 
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Probability  threshold  0 

.5 

.7 

.8 

.9 

.99 

Fault 

rates 

(3  7  8) 

X  (5  8/ 124s) 

X  (17/40s) 

X  (10/25s) 

X (8/2  Is) 

X  (2/5  s) 

(10  8  9) 

/  (32/78s) 

/  (95/225s) 

/  (39471013s) 

X  (7 10/1 738s) 

X  (8/2 Is) 

(20  10  20) 

/  (9/2 Is) 

/  (16/36s) 

/  (24/54s) 

/  (44/1 00s) 

X  (1626/3995s) 

(30  30  30) 

/  (9/24s) 

/(16/41s) 

/  (24759s) 

/  (44/107s) 

/  (23  975  89s) 

Table  1:  Number  of  samples  /  verification  time  when  testing  (18)  with  uniform,  equally  likely  priors  and 
T  =  1000:  X  =  'Hq  rejected’,  /  =  ‘//q  accepted’. 


Probability  threshold  0 

.5 

.7 

.8 

.9 

.99 

Fault 

rates 

(3  7  8) 

X  (55/1 17s) 

X  (12/28s) 

X  (10725  s) 

X  (8/2 Is) 

X (2/5 s) 

(10  8  9) 

/  (28/69s) 

/  (64/1 50s) 

/  (347/876s) 

X  (255/632s) 

X  (8/2 Is) 

(20  10  20) 

/  (8/1 8s) 

/  (13/30s) 

/  (20/45 s) 

/  (39/88s) 

X  (1463/3613s) 

(30  30  30) 

/  (7/1 8s) 

/  (13734s) 

/(18745s) 

/  (33/80s) 

/  (2017502s) 

Table  2:  Number  of  samples  /  verification  time  when  testing  (18)  with  informative  priors  and  T  =  1000:  X 
=  'H0  rejected’,  /  =  ‘Hq  accepted’. 


In  Table  1  and  Table  2  we  report  our  results.  Even  the  longest  test  (for  0  =  .99  and  fault  rates 
(20  10  20)  in  Table  1)  Bayesian  SMC  terminates  after  3995s  already.  This  is  very  good  perfor¬ 
mance  for  a  test  with  such  a  small  (.001)  error  probability  run  on  a  standard  desktop  computer.  We 
note  the  total  time  spent  for  this  case  on  actually  computing  the  statistical  test  i.e.,  Bayes  factor 
computation,  was  just  about  Is.  Also,  by  comparing  the  numbers  of  Table  1  and  2  we  note  that 
the  use  of  an  informative  prior  generally  helps  the  algorithm  -  i.e.,  fewer  samples  are  required  to 
decide. 

Next,  we  estimate  the  probability  that  M  satisfies  the  following  property,  using  our  Bayesian 
estimation  algorithm: 

M  h  (-F100G \FuelFlowRate  =  0))  .  (19) 

In  particular,  we  ran  two  sets  of  tests,  one  with  half-interval  size  8  =  .05  and  another  with  8  =  .01. 
In  each  set  we  used  different  values  for  the  interval  coefficient  c  and  different  sensor  fault  rates,  as 
before.  Experimental  results  are  in  Table  3  and  4.  We  used  uniform  priors  in  both  cases. 

6.2  Discussion 

A  general  trend  shown  by  our  experimental  results  and  additional  simulations  is  that  our  Bayesian 
estimation  model  checking  algorithm  is  generally  faster  at  the  extremes,  i.e.,  when  the  unknown 
probability  p  is  close  to  0  or  close  to  1.  Performance  is  worse  when  p  is  closer  to  0.5.  In  contrast, 
the  performance  of  our  Bayesian  hypothesis  testing  model  checking  algorithm  is  faster  when  the 
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Interval  coverage  c 

.9 

.95 

.99 

.999 

Fault 

rates 

(3  7  8) 

.4/258 

.376/357 

.3569  /  606 

.3429  /  972 

(10  8  9) 

.8857/  103 

.8904  /  144 

.8785/286 

.8429  /  590 

(20  10  20) 

.9565/21 

.9667  /  28 

.9561  /  112 

.9625/  158 

(30  30  30) 

.9565/21 

.9667  /  28 

.9778/43 

.9851  /65 

samples  needed  in  [12] 

4793 

5902 

8477 

12161 

Table  3:  Posterior  mean  /  number  of  samples  for  estimating  probability  of  (19)  with  uniform  prior  and 
8  =  .05,  and  comparison  with  the  samples  needed  by  the  Chernoff-Hoeffding  bound. 


Interval  coverage  c 

.9 

.95 

.99 

.999 

Fault 

rates 

(3  7  8) 

.3603/6234 

.3559/8802 

.3558/15205 

.3563/24830 

(10  8  9) 

.8534/3381 

.8518/4844 

.8528/8331 

.8534/13569 

(20  10  20) 

.9764/592 

.9784/786 

.9840/1121 

.9779/2583 

(30  30  30) 

.9913/113 

.9933/148 

.9956/227 

.9971/341 

samples  needed  in  [12] 

119829 

147555 

211933 

304036 

Table  4:  Posterior  mean  /  number  of  samples  when  estimating  probability  of  (19)  with  uniform  prior  and 
8  =  .01,  and  comparison  with  the  samples  needed  by  the  Chernoff-Hoeffding  bound. 

unknown  true  probability  p  is  far  from  the  threshold  probability  0. 

We  note  the  remarkable  performance  of  our  estimation  approach  compared  to  the  technique 
based  on  the  Chernoff-Hoeffding  bound  [12].  From  Table  3  and  4  we  see  that  when  the  unknown 
probability  is  close  to  1 ,  our  algorithm  can  be  between  two  and  three  orders  of  magnitude  faster. 
(The  same  argument  holds  when  the  true  probability  is  close  to  0.)  Chernoff-Hoeffding  bounds 
hold  for  any  random  variable  with  bounded  variance.  Our  Bayesian  approach,  instead,  explicitly 
builds  the  posterior  distribution  on  the  basis  of  the  Bernoulli  sampling  distribution  and  the  prior. 

6.3  Performance  Evaluation 

We  have  conducted  a  series  of  simulations  to  analyze  the  performance  (measured  as  number  of 
samples)  of  our  sequential  Bayesian  estimation  algorithm  with  respect  to  the  unknown  probability 
p.  In  particular,  we  have  run  simulations  for  values  of  p  ranging  from  .01  to  .99,  with  coverage  (c) 
of  .9999  and  .99999,  interval  half-size  (8)  of  .001  and  .005,  and  uniform  prior.  We  present  details 
of  our  simulations  in  Figure  1 . 

Our  Simulink  experiments  show  that  Bayesian  estimation  is  very  fast  when  p  is  close  to  either 
0  or  1,  while  the  algorithm  needs  a  larger  number  of  samples  when  p  is  close  to  In  a  sense, 
our  algorithm  can  decide  easier  PMC  instances  faster:  if  the  probability  p  of  a  formula  being  true 
is  very  small  or  very  large,  we  need  fewer  samples.  This  is  another  advantage  of  our  approach 
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Figure  1:  Performance  of  Bayesian  estimation:  number  of  samples  vs  probability 


that  it  is  not  currently  matched  by  other  SMC  estimation  techniques  (e.g.,  [12]).  Our  findings  are 
consistent  with  those  of  Yu  et  al.  for  the  VLSI  testing  domain  [29]. 

Our  simulations  also  indicate  that  the  performance  of  the  algorithm  depends  more  strongly  on 
the  half-size  8  of  the  estimated  interval  than  on  the  coverage  c  of  the  interval  itself.  It  is  much 
faster  to  estimate  an  interval  of  half-size  8  =  .005  with  coverage  c  —  .99999  than  it  is  to  estimate 
an  interval  of  8  =  .001  with  c  —  .9999.  More  theoretical  work  is  needed,  however,  to  understand 
fully  the  behavior  of  the  Bayesian  sequential  estimation  algorithm.  Our  initial  findings  suggest  that 
the  algorithm  scales  very  well. 


7  Related  Work 

Younes  and  Simmons  introduced  the  first  algorithm  for  Statistical  Model  Checking  [27,  28].  Their 
work  uses  the  SPRT  [25],  which  is  designed  for  simple  hypothesis  testing3.  Specifically,  the  SPRT 
decides  between  the  simple  null  hypothesis  Hq  :  fM  \—  P=e0((|))  against  the  simple  alternate  hy¬ 
pothesis  H[  :  i M  \=  P=e1  (<])) ,  where  Go  <  0i-  The  SPRT  is  optimal  for  simple  hypothesis  testing, 

3A  simple  hypothesis  completely  specifies  a  distribution.  For  example,  a  Bernoulli  distribution  of  parameter  p  is 
fully  specified  by  the  hypothesis  p  =  0.3  (or  some  other  numerical  value).  A  composite  hypothesis,  instead,  still  leaves 
the  free  parameter  p  in  the  distribution.  This  results,  e.g.,  in  a  family  of  Bernoulli  distributions  with  parameter  p  <  0.3. 
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since  it  minimizes  the  expected  number  of  samples  among  all  the  tests  satisfying  the  same  Type 
I  and  II  errors,  when  either  H'{)  or  H[  is  true  [25].  The  PMC  problem  is  instead  a  choice  between 
two  composite  hypotheses  Hq  :  M  |=  P>q(§)  versus  H\  :  M  \=  P<  e((|)).  The  SPRT  is  not  defined 
unless  0o  f  0i,  so  Younes  and  Simmons  overcome  this  problem  by  separating  the  two  hypotheses 
by  an  indifference  region  (0  —  8, 0  +  8),  inside  which  any  answer  is  tolerated.  Here  0  <  8  <  1  is  a 
user-specified  parameter.  It  can  be  shown  that  the  SPRT  with  indifference  region  can  be  used  for 
testing  composite  hypotheses,  while  respecting  the  same  Type  I  and  II  errors  of  a  standard  SPRT 
[9,  Section  3.4].  However,  in  this  case  the  test  is  no  longer  optimal,  and  the  maximum  expected 
sample  size  may  be  much  bigger  than  the  optimal  fixed-size  sample  test  -  see  [4]  and  [9,  Section 
3.6].  Our  approach  solves  instead  the  composite  hypothesis  testing  problem,  with  no  indifference 
region. 

The  method  of  [12]  uses  a  fixed  number  of  samples  and  estimates  the  probability  that  the 
property  holds  as  the  number  of  satisfying  traces  divided  by  the  number  of  sampled  traces.  Their 
algorithm  guarantees  the  accuracy  of  the  results  using  Chemoff-Hoeffding  bounds.  In  particular, 
their  algorithm  can  guarantee  that  the  difference  in  the  estimated  and  the  true  probability  is  less 
than  £,  with  probability  p,  where  p  <  1  and  8  >  0  are  user-specified  parameters.  Our  experimental 
results  show  a  significant  advantage  of  our  Bayesian  estimation  algorithm  in  the  sample  size. 

Grosu  and  Smolka  use  a  standard  acceptance  sampling  technique  for  verifying  formulas  in 
LTL  [10].  Their  algorithm  randomly  samples  lassos  (i.e.,  random  walks  ending  in  a  cycle)  from  a 
Biichi  automaton  in  an  on-the-fly  fashion.  The  algorithm  terminates  if  it  finds  a  counterexample. 
Otherwise,  the  algorithm  guarantees  that  the  probability  of  finding  a  counterexample  is  less  than  8, 
under  the  assumption  that  the  true  probability  that  the  LTL  formula  is  true  is  greater  than  £  (8  and 
£  are  user-specified  parameters). 

Sen  et  al.  [23]  used  the  p-value  for  the  null  hypothesis  as  a  statistic  for  hypothesis  testing.  The 
p- value  is  defined  as  the  probability  of  obtaining  observations  at  least  as  extreme  as  the  one  that 
was  actually  seen,  given  that  the  null  hypothesis  is  true.  It  is  important  to  realize  that  a  p-value 
is  not  the  probability  that  the  null  hypothesis  is  true.  Sen  et  aids  method  does  not  have  a  way  to 
control  the  Type  I  and  II  errors.  Sen  et  al.  [24]  have  started  investigating  the  extension  of  SMC 
to  unbounded  (i.e.,  standard)  LTL  properties.  Finally,  Langmead  [18]  has  applied  Bayesian  point 
estimation  and  SMC  for  querying  Dynamic  Bayesian  Networks. 


8  Conclusions  and  Future  Work 

Extending  our  Statistical  Model  Checking  (SMC)  algorithm  that  uses  Bayesian  Sequential  Hypoth¬ 
esis  Testing,  we  have  introduced  the  first  SMC  algorithm  based  on  Bayesian  Interval  Estimation. 
For  both  algorithms,  we  have  proven  analytic  bounds  on  the  probability  of  returning  an  incorrect 
answer,  which  are  crucial  for  understanding  the  outcome  of  Statistical  Model  Checking.  We  have 
used  SMC  for  Stateflow/Simulink  models  of  a  fuel  control  system  featuring  fault-tolerance  and 
hybrid  behavior.  Because  verification  is  fast  in  most  cases,  we  expect  SMC  methods  to  enjoy  good 
scalability  properties  for  larger  Stateflow/Simulink  models.  Our  Bayesian  estimation  is  orders  of 
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magnitudes  faster  than  previous  estimation-based  model  checking  algorithms. 
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